Approximate Range Querying over Sliding Windows
Authors
Abstract
In the context of Knowledge Discovery in Databases, data reduction is a pre-processing step that delivers succinct yet meaningful data to subsequent stages. When the target of mining is a data stream, reducing it suitably is crucial, since analyses on such data often require multiple scans. In this chapter, we propose a histogram-based approach to reducing sliding windows that supports approximate arbitrary (i.e., non-biased) range-sum queries. The histogram is based on a hierarchical structure (as opposed to the flat structure of traditional histograms) and is therefore well suited to directly supporting hierarchical queries, such as drill-down and roll-up operations. In particular, both sliding window shifting and query answering take time logarithmic in the sliding window size. Experimental analysis shows that our method is more accurate than state-of-the-art histogram-based sliding window reduction techniques.
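The logarithmic shift and query costs mentioned in the abstract can be illustrated with a small sketch. The code below is not the chapter's histogram: it stores every value exactly and returns exact sums, whereas the proposed structure is a compressed histogram that returns approximate answers. It only shows, under that simplifying assumption, how a tree laid over a circular buffer makes both a window shift and a range-sum query take a logarithmic number of steps. The class name `SlidingWindowSumTree` and its methods are hypothetical.

```python
class SlidingWindowSumTree:
    """Exact range sums over the last `size` stream values.

    Illustrative only: an iterative segment tree over a circular buffer,
    not the compressed histogram described in the abstract.
    """

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.size = size
        self.tree = [0] * (2 * size)  # leaves live at indices [size, 2*size)
        self.head = 0                 # physical slot holding the oldest value
        self.count = 0                # number of values currently in the window

    def push(self, value):
        """Append a value; once the window is full, overwrite the oldest one."""
        if self.count < self.size:
            slot = (self.head + self.count) % self.size
            self.count += 1
        else:
            slot = self.head
            self.head = (self.head + 1) % self.size
        i = slot + self.size
        self.tree[i] = value
        i //= 2
        while i >= 1:  # refresh the ancestors of the changed leaf
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
            i //= 2

    def _physical_sum(self, lo, hi):
        """Sum of physical slots in the half-open interval [lo, hi)."""
        total = 0
        lo += self.size
        hi += self.size
        while lo < hi:
            if lo & 1:
                total += self.tree[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                total += self.tree[hi]
            lo //= 2
            hi //= 2
        return total

    def range_sum(self, start, end):
        """Sum of window positions start..end inclusive (0 is the oldest value)."""
        if not 0 <= start <= end < self.count:
            raise IndexError("range outside the current window")
        lo = (self.head + start) % self.size
        hi = (self.head + end) % self.size
        if lo <= hi:
            return self._physical_sum(lo, hi + 1)
        # The logical range wraps around the end of the circular buffer.
        return self._physical_sum(lo, self.size) + self._physical_sum(0, hi + 1)
```

For example, after pushing the values 1 through 6 into a window of size 4, the window holds 3, 4, 5, 6, so `range_sum(0, 3)` covers the whole window and `range_sum(1, 2)` covers the two middle values. Roll-up and drill-down queries correspond loosely to reading sums at higher or lower levels of the tree, though the chapter's structure may organize its levels differently.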
Similar Resources
Querying Sliding Windows Over Online Data Streams
A data stream is a real-time, continuous, ordered sequence of items generated by sources such as sensor networks, Internet traffic flow, credit card transaction logs, and on-line financial tickers. Processing continuous queries over data streams introduces a number of research problems, one of which concerns evaluating queries over sliding windows defined on the inputs. In this paper, we descri...
Querying Regular Languages over Sliding Windows
We study the space complexity of querying regular languages over data streams in the sliding window model. The algorithm has to answer at any point of time whether the content of the sliding window belongs to a fixed regular language. A trichotomy is shown: For every regular language the optimal space requirement is either in Θ(n), Θ(log n), or constant, where n is the size of the sliding window...
Reducing Data Stream Sliding Windows by Cyclic Tree-Like Histograms
Data reduction is a basic step in a KDD process useful for delivering to successive stages more concise and meaningful data. When mining is applied to data streams, that are continuous data flows, the issue of suitably reducing them is highly interesting, in order to arrange effective approaches requiring multiple scans on data, that, in such a way, may be performed over one or more reduced sli...
Querying languages over sliding windows
We study the space complexity of querying languages over data streams in the sliding window model. The algorithm has to answer at any point of time whether the content of the sliding window belongs to a fixed regular language. For regular languages, a trichotomy is shown: For every regular language the optimal space requirement is asymptotically either constant, logarithmic, or linear in the si...
Sketch-based Querying of Distributed Sliding-Window Data Streams
While traditional data-management systems focus on evaluating single, ad hoc queries over static data sets in a centralized setting, several emerging applications require (possibly continuous) answers to queries on dynamic data that is widely distributed and constantly updated. Furthermore, such query answers often need to discount data that is “stale”, and operate solely on a sliding window of...